Identifying and Perceiving Environmental Sounds

Few  details  are  known  about  how  we  identify  and  perceive  everyday 
sounds.  This  is  surprising  given  the  ubiquitous  presence  of  these  sounds  and  their 
important  functional  role.  It  is  further  surprising  given  that  listeners  have  probably 
developed  a  vast  amount  of  knowledge  about  the  sounds.  Knowledge  about  a 
sound  would  include  knowledge  of  its  spectral  and  temporal  attributes,  knowledge 
of  perceptual  characteristics  of  the  sound,  verbal  labels  for  the  sound,  and  of 
course,  knowledge  of  the  cause  of  the  sound.  Unfortunately,  this  tentative  listing  of 
what  the  listener  knows  about  environmental  sound  is  based  not  upon  a  theory  of 
how  these  sounds  are  perceived.  Such  a  theory  does  not  exist,  and  the  types  of 
knowledge listed come from the types of research that has been done on these sounds.
Unfortunately,  the  research  is  scattered  in  its  methods  and  its  selection  of  stimuli. 
Studies  that  include  a  diversity  of  sounds  examined  from  perceptual  and  acoustic 
perspectives  would  begin  to  reveal  details  of  how  everyday  sounds  are  identified. 
This  technical  report  is  a  study  of  the  perceptual-cognitive  judgments  listeners 
made  about  a  set  of  41  environmental  sounds  and  the  relationship  between  these 
judgments  and  acoustic  measures.  The  judgments  included  timed  identifications 
and  perceptual-cognitive  evaluations  of  the  sounds. 

Knowledge  about  environmental  sounds  includes  knowledge  of  spectral 
and  temporal  attributes.  Much  has  been  learned  about  the  acoustics  of  everyday 
sounds  through  acoustic  analysis.  Analyses  to  date  suggest  that  the  attributes 
used  to  identify  a  sound  are  idiosyncratic  to  the  sound.  For  example,  the 
distinguishing  acoustic  pattern  of  a  breaking  bottle  is  the  asynchronous  impulses 
produced  by  the  individual  pieces  bouncing  after  breakup  of  the  bottle  (Warren  & 
Verbrugge,  1984).  A  bouncing  bottle  produces  a  series  of  discrete  impulses  that 
are damped in amplitude. Acoustic analysis of agricultural machinery indicated that
a high band spectrum, 325-3500 Hz, was more informative to the users about
engine load than a lower band, 20-200 Hz (Talamo, 1982). Repp (1987) found that
spectral  peaks  of  hand  claps  were  related  to  hand  configuration  during  the  clap. 
Halpern,  Blake,  and  Hillenbrand  (1986)  found  that  a  scraping  sound  similar  to  a 
fingernail  across  a  blackboard  became  less  chilling  as  the  low  frequencies  were 
filtered,  suggesting  that  the  low  spectrum  produced  the  discomfort  of  a  chilling 
sound. Gaver (1986) found that impacting wood and metal objects, as well as the
lengths of the objects, can be discriminated using spectral attributes. These
examples demonstrate that accurate identification of a sound depends upon the
presence  of  attributes  that  are  specific  to  the  production  of  the  sound.  On  the  other 
hand,  Vanderveer  (1979)  concluded  that  when  multiple  causes  produce  similar 
effects,  then  identification  is  compromised.  According  to  her,  this  condition  often 
exists  in  identifying  the  types  of  objects  involved  in  an  impact. 

Although  acoustic  analysis  is  important  in  understanding  environmental 
sound  identification,  a  focus  on  acoustic  attributes  alone  might  produce  limited 
results. The production mechanisms of everyday sounds are vast, and the
resulting acoustic attributes are not constrained by a common production
mechanism, as they are in speech. Thus it is unlikely that an underlying set of
acoustic features common to a variety of sounds will be found.

Listeners identify sounds with verbal labels, and this has received some
attention.  Bartlett  (1977)  found  that  verbal  labeling  improves  both  free  recall  of 
sounds  and  recognition  of  sounds  previously  presented.  However,  the  facilitative 
effects  of  labeling  require  consistent  labeling  of  the  sounds.  The  effects  of 
consistent  labeling  might  be  due  to  an  elimination  of  the  effect  of  causal 
uncertainty.  Consistent  labeling  would  constrain  the  set  of  alternative  causes  of  a 
sound  to  a  single  cause,  eliminating  uncertainty  and  enhancing  recognition 
performance.  Consistent  with  this  interpretation,  Lawrence  (1979)  found  that 
recognition  performance  improved  if  participants  were  given  an  opportunity  to 
review  the  labels  they  had  produced  previously.  The  review  would  explicitly 
constrain  the  set  of  alternative  causes.  Other  studies  of  labeling  environmental 
sounds  have  compared  memory  for  sounds  to  memory  for  the  labels  of  these 
sounds  (Miller  &  Tanis,  1971;  Paivio,  Philipchalk,  &  Rowe,  1975).  Both  recognition 
and recall memory have been compared. Generally, recall is better for labels and
there  is  little  difference  in  recognition. 

Finally,  a  few  studies  have  asked  for  perceptual  judgments  about  everyday 
sounds.  These  judgments  are  typically  ratings  of  the  sounds  on  semantic 
differential scales which are then factor analyzed (e.g., Bjork, 1985; Solomon,
1958; Von Bismarck, 1974). The semantic scales that have emerged from these
studies  include  loud-soft,  soft-hard,  round-angular,  dull-sharp,  relaxed-tense, 
pleasant-unpleasant,  interesting-dull,  and  compact-scattered.  Some  of  these 
scales  characterize  the  timbre  perception  of  everyday  sounds,  but  others  may  tap 
affective  judgments.  Solomon  (1959a,  1959b)  and  Bjork  (1985)  have  had  some 
success  in  relating  these  judgments  to  acoustic  attributes  of  the  sounds. 

There are limitations to the studies that have been done on everyday sound
perception.  Most  of  the  studies  have  focused  on  a  limited  set  of  sounds,  and  none 
have  collected  acoustic,  perceptual,  and  cognitive  data  to  assess  the  role  of  all 
three in identification of everyday sounds. The data analyzed in this paper include
data in all three domains, on a set of 41 sounds that include very different types of
sounds in order to broaden our understanding of the perception of this type of
sound. Acoustics of these sounds were analyzed, perceptual-cognitive judgments
about the sounds were obtained, and identification responses were analyzed for
uncertainty  and  accuracy. 

There  is  special  attention  to  the  identification  time  of  the  cause  of  a  sound 
and  how  this  duration  is  related  to  the  stereotypy  of  the  sound  and  the  probability  of 
alternative causes for the sound. An example of alternative causation is that a
"click-click" can be produced by a ball-point pen, a light switch, certain types of
staplers, and a camera, to name a few alternatives. Ballas, Sliwinski, and Harding
(1986) found that the log of the mean time to identify (LMIT) an everyday
environmental sound was a function of the logarithm of the number of alternatives
that were given as causes for the sound. This finding is similar to the Hick-Hyman
law for choice-reaction time (Hick, 1952; Hyman, 1953). It raises several questions
about  the  cognitive  process  involved  in  the  consideration  of  alternative  causes. 
What  alternatives  are  considered?  How  are  they  related?  Which  aspects  of  the 
alternatives  qualify  them  for  consideration?  An  important  question  is  how  to 
quantify  alternative  causation  so  that  its  effect  on  performance  can  be  determined. 
Ballas and Sliwinski (1986) used the information measure H to quantify the causal
uncertainty of 41 sounds. Their calculation was actually a measure of response
equivocation in identifying a sound. The actual identification responses given by
the listeners were sorted to determine how many different responses were given.
The number of different responses was used to determine the number of
alternatives, and the relative frequencies of these alternatives were used to estimate
the conditional probability of the alternatives. An extended discussion of this
application of the information measure is given in Ballas and Sliwinski (1986).

The first experiment in Ballas and Sliwinski (1986) was conducted to
determine the causal uncertainty values and identification response times for a set
of sounds. Forty-one sounds (described in Table 1 and Appendix A, with
waveforms in Appendix C) were obtained from sound-effects records. They were
chosen to represent a variety of environmental sounds and, at the same time, to
pose both easy and difficult identification problems. The sounds were digitized
and judged to be subjectively good representations of the events causing the
sound. A discrimination
experiment  confirmed  that  the  sounds  were  discriminable  from  each  other.  In  this 
study,  two  listeners  heard  each  of  the  820  combinations  of  the  41  sounds  in  an 
ABX  paradigm.  The  order  for  each  combination  was  determined  randomly,  and  the 
combinations  were  presented  in  random  order.  Feedback  was  presented. 
Performance  was  99.8%  for  each  listener,  which  was  only  two  errors  in  820 
judgments.  None  of  the  combinations  on  which  errors  were  made  were  similar  for 
the  two  listeners.  Both  listeners  reported  that  the  errors  resulted  from  a  lapse  in 
attention. 
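
The pair count and the percentage reported above follow directly from the design. A minimal Python check, assuming that each unordered pair of distinct sounds was presented once to each listener:

```python
from math import comb

n_sounds = 41
n_pairs = comb(n_sounds, 2)      # unordered pairs of distinct sounds
errors = 2                       # errors reported for each listener
percent_correct = 100 * (n_pairs - errors) / n_pairs

print(n_pairs)                   # 820
print(round(percent_correct, 1)) # 99.8
```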

Ballas and Sliwinski (1986) presented the sounds at a comfortable listening
level  in  random  order  to  listeners  who  were  asked  to  identify  the  sounds.  The 
identification  responses  were  sorted  by  two  research  assistants  and  a  third  person 
who  was  unfamiliar  with  the  research  hypothesis.  This  third  sorter  was  a 
professional  technical  writer.  All  three  individuals  sorted  the  responses  into 
categories  of  similar  events.  Responses  that  were  identical,  synonyms,  or  that 
described  the  same  physical  scene  were  binned  together.  These  sortings  were 
then used to compute the uncertainty statistic using the equation:

$$H_j = -\sum_{i=1}^{n} p_{ji} \log_2 p_{ji}$$

where $H_j$ is the measure of causal uncertainty for sound $j$, $p_{ji}$ is the
proportion of all identification responses for sound $j$ sorted into event category $i$,
and $n$ is the number of categories for the identification responses to sound $j$. Three
sets of uncertainty values were computed, one for each of the three sorters. The
reliabilities of the three sorters were significant (pairwise r = .[?]5, .87, and .87;
p < .0001). The median uncertainty value for each sound was used in the
analyses in this paper. In this paper, this measure of causal uncertainty is related
to perceptual-cognitive judgments and acoustic attributes of the same sounds.
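
The uncertainty statistic can be illustrated with a short Python sketch. It assumes the identification responses for one sound have already been sorted into event categories; the function name and the example responses are hypothetical and are not data from the report.

```python
from collections import Counter
from math import log2

def causal_uncertainty(categories):
    """Entropy in bits of the event categories assigned to one sound.

    Computes sum(p * log2(1 / p)), which equals -sum(p * log2(p)).
    """
    counts = Counter(categories)
    total = sum(counts.values())
    return sum((c / total) * log2(total / c) for c in counts.values())

# Hypothetical sorted responses from 20 listeners (illustration only).
telephone = ["phone ringing"] * 20
click = ["pen"] * 5 + ["light switch"] * 5 + ["stapler"] * 5 + ["camera"] * 5

print(causal_uncertainty(telephone))  # 0.0, every listener gave the same cause
print(causal_uncertainty(click))      # 2.0, four equally frequent causes
```

A sound that every listener attributes to the same event has zero uncertainty, and the value grows both with the number of categories and with how evenly the responses are spread across them.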

In order to evaluate the role of perceptual-cognitive judgments in the
identification  of  the  sounds,  listeners  were  asked  to  rate  the  sounds  on  perceptual 
and  cognitive  scales.  The  scales  used  in  this  study  were  derived  from  a  review  of 
the  scales  used  in  the  timbre  studies  and  in  verbal  research.  Perceptual  ratings  of 
the  timbre  of  the  41  sounds  were  obtained  using  scales  taken  from  previous 
studies  (e.g.,  Solomon,  1958;  Von  Bismarck,  1974;  Bjork,  1985).  Some  of  the 
scales that have emerged from these studies include loud-soft, soft-hard,
round-angular, dull-sharp, relaxed-tense, pleasant-unpleasant, interesting-dull, and
compact-scattered. Some success has been achieved in relating the scales to
acoustic  attributes. 

Cognitive rating scales were used to solicit listener judgments in a
manner similar to how ratings have been used to assess verbal materials on
category size (Battig & Montague, 1969), goodness of example (Rosch, 1975),
meaningfulness and association value (Noble, Stockwell, & Pryor, 1957), and
concreteness and specificity (Spreen & Schulz, 1966). Comparable data do not
exist for everyday sounds even though these sounds have cognitive attributes.
Some of the scales requested judgments about the perceived cause of the sound.
In these, a further distinction was made between the action and the agents involved
because Vanderveer (1979) found that the action of a cause was more accurately
identified than the agent.


Method 

Stimuli. The set of 41 sounds from Ballas & Sliwinski (1986) was used. The
duration  of  the  sounds  was  inaccurately  reported  in  their  report.  The  actual 
duration  varied  for  the  sounds,  but  was  a  maximum  of  .625  s.  The  sample  rate  in 
digitizing  and  generating  the  sounds  was  16  kHz. 

Listeners.  Twenty  college  students  were  listeners  in  this  experiment  and 
were  paid  or  received  class  credit  for  their  participation. 

Rating  Scales.  Twenty-two  rating  scales  (see  Appendix  B)  were  constructed 
using  themes  that  had  been  found  to  be  important  in  previous  research  on 
environmental  sound  and  in  verbal  research.  Listeners  were  also  asked  to  rate  the 
identifiability of the sound, and to classify the sound in terms of Gaver's (1986)
scheme, which is based upon the type of mapping between a sound and its
meaning. He suggests three types of mappings (symbolic, metaphorical, and
nomic) and develops the implications of each type in the use of natural sound in
computer interfaces.

Procedure.  Participants  were  tested  individually  by  interacting  with  a 
microcomputer  which  presented  stimuli  and  collected  responses  on  a  standard 
keyboard.  A  trial  was  initiated  by  pressing  the  space  bar.  A  sound  was  then 
played  through  earphones.  Participants  then  rated  the  sound  on  each  of  the 
scales,  always  having  the  option  to  hear  the  sound  again.  The  sounds  were 
presented  in  random  order.  The  order  of  the  ratings  was  fixed.  Breaks  were  given 
after  the  fourteenth  and  twenty-eighth  sound  to  offset  fatigue. 

Further Analyses of Ballas & Sliwinski


Ballas & Sliwinski did not report data on identification accuracy. In a
reanalysis of the data, identification accuracy was calculated for each sound taking
as  accurate  any  response  that  met  criteria  used  by  Vanderveer  (1979).  Briefly, 
these  criteria  specify  a  response  as  correct  if  it  provided  a  reference  to  the 
generating  event  or  to  a  class  of  events  that  would  include  the  generating  event. 

Ballas and Sliwinski included only limited acoustic analyses in their report.
The  following  acoustic  parameters  were  computed  to  describe  the  acoustics  of  the 
sounds.  It  is  recognized  that  these  parameters  might  not  describe  important 
temporal  variations  in  the  sounds.  Some  of  the  temporal  attributes  would  be 
idiosyncratic,  and  not  be  computable  for  the  full  set  of  sounds.  This  was  even  the 
case  for  other  spectral  attributes  (e.g.,  fundamental  frequency)  that  were 
considered  but  not  used  in  these  analyses. 

Sound  length.  The  duration  of  the  sound. 

Average  magnitude.  The  average  absolute  voltage  level  of  the  sound. 

Peak  magnitude.  The  maximum  voltage  level  of  the  sound. 

Power.  The  average  power  of  the  sound  in  dB. 

Average  FFT  spectrum.  The  FFT  spectrum  of  the  sound  averaged  from  a  moving 
FFT analysis of 24 ms Hanning windows, shifted at 12 ms increments. The
frequency  resolution  of  the  FFT  was  40.7  Hz. 

Maximum  spectrum  magnitude.  The  maximum  value  of  the  average  FFT  spectrum, 
in  dB  units. 

Maximum  spectrum  frequency.  The  frequency  of  the  FFT  spectrum  component  with 
the  maximum  magnitude. 

Moments  of  the  average  FFT  spectrum.  The  average  spectrum  was  treated  as  a 
distribution,  and  second,  third  and  fourth  central  moments  of  this  distribution 
were  computed  (Chen,  1983).  Skewness  and  kurtosis  of  this  distribution  were 
calculated  from  these  moments. 

1/3 octave band spectrum. Computed by filtering the sound with 1/3 octave, five-pole
Butterworth bandpass digital filters, and integrating the power out of each filter.
Seventeen bands with center frequencies of 200, 250, 315, 400, 500, 630, 800,
1000, 1250, 1600, 2000, 2500, 3150, 4000, 5000, 6300, and 8000 Hz were
used. These spectra are presented in Appendix C. Bands lower than 200 Hz
were dominated by noise which was probably due to record surface noise
(Alexandrovich, 1987) and were filtered out. Results of the spectral analysis
reported  later  were  similar  when  a  lower  band  (160  Hz  center  frequency)  was 
included.  The  1/3  octave  spectra  for  the  sounds  were  verified  by  comparing 
these  spectra  to  an  approximation  of  the  1/3  octave  bands  obtained  by 
combining  components  of  the  FFT  spectrum,  and  by  transporting  several  of  the 
sounds  to  a  computer  running  the  ILS  signal  processing  software  and 
analyzing  the  1/3  octave  spectra  with  this  software. 
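
The average FFT spectrum and its moments, as defined in the list above, can be sketched in Python with NumPy. This is an illustration under stated assumptions and not the original analysis code: the function names are hypothetical, the sample rate is passed in as a parameter, the magnitude spectrum (not power or dB) is averaged, and the moments are normalized in the usual way for skewness and kurtosis. The frequency resolution depends on the exact sample rate and window length, so the sketch does not try to reproduce the 40.7 Hz figure.

```python
import numpy as np

def average_spectrum(x, fs, win_s=0.024, hop_s=0.012):
    """Average magnitude spectrum over Hann-windowed frames (24 ms, 12 ms hop)."""
    n_win = int(round(win_s * fs))
    n_hop = int(round(hop_s * fs))
    window = np.hanning(n_win)
    frames = np.array([x[i:i + n_win] * window
                       for i in range(0, len(x) - n_win + 1, n_hop)])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(n_win, d=1.0 / fs)
    return freqs, mags.mean(axis=0)

def spectral_moments(freqs, mags):
    """Treat the average spectrum as a distribution over frequency."""
    p = mags / mags.sum()                      # normalize to unit area
    mean = np.sum(freqs * p)
    m2, m3, m4 = (np.sum((freqs - mean) ** k * p) for k in (2, 3, 4))
    return {"mean": mean, "m2": m2,
            "skewness": m3 / m2 ** 1.5, "kurtosis": m4 / m2 ** 2}

# Usage with a synthetic 1 kHz tone (not one of the 41 sounds).
fs = 16000
t = np.arange(int(0.5 * fs)) / fs
tone = np.sin(2 * np.pi * 1000 * t)
freqs, mags = average_spectrum(tone, fs)
moments = spectral_moments(freqs, mags)
peak_frequency = freqs[np.argmax(mags)]        # maximum spectrum frequency
```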

Results 

Principal  Components  Analysis  of  Spectra. 

The  1/3  octave  band  spectra  were  analyzed  with  a  principal  components 
analysis  to  determine  if  fewer  components  might  describe  the  spectra  of  these  41 
signals.  The  variance-covariance  matrix  was  used  in  this  analysis  to  preserve 
spectral  levels  in  the  bands.  Four  factors  which  accounted  for  85%  of  the  variance 
were  retained.  The  solution  was  rotated  with  a  varimax  rotation  which  reduced  the 
variance  explained  by  the  first  component  from  32%  to  29%.  The  factor  loadings 
are shown in Table 2. The rotated factor pattern showed that upper bands (> 3150
Hz center frequency) load on the first factor (AF1), higher middle bands (1000 Hz to
2500 Hz center frequency) on the second (AF2), low bands (200 Hz and 315 Hz
center frequency) on the third factor (AF3) and lower middle bands (400 Hz to 800
Hz center frequency) on the fourth factor (AF4). Thus the average spectrum for
these sounds is described by factors representing these four frequency regions.
Factor scores were obtained for use in later analyses. These factor scores,
especially AF1, correlated significantly with other acoustic measures of the
frequency spectrum (e.g., AF1 correlated with the mean frequency, r = .66, p <
.0001, the second moment, r = -.85, p < .0001, the skewness, r = -.65, p < .0001,
and the kurtosis, r = .57, p < .0001, of the FFT spectrum) but not with measures that
are unrelated to frequency such as the power or peak magnitude.
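
A principal components analysis on the variance-covariance matrix followed by a varimax rotation can be sketched as follows. This is a minimal NumPy illustration and not the software used in the report; the function names are hypothetical, the input is a random stand-in for the 41 x 17 matrix of band levels, and the rotation is the common SVD-based varimax iteration without Kaiser normalization.

```python
import numpy as np

def pca_covariance(X, n_factors):
    """PCA on the covariance matrix; rows are sounds, columns are bands."""
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]               # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals[:n_factors].sum() / eigvals.sum()
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return loadings, explained

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    crit = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p))
        R = u @ vt                                  # nearest orthogonal matrix
        new_crit = s.sum()
        if new_crit < crit * (1 + tol):
            break
        crit = new_crit
    return loadings @ R

# Stand-in data: 41 sounds by 17 band levels (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((41, 17))
loadings, explained = pca_covariance(X, n_factors=4)
rotated = varimax(loadings)
```

Because the rotation is orthogonal, it leaves the total variance of the retained factors unchanged and only redistributes it among them, which is why the share attributed to the first factor can drop after rotation.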

Acoustic  Factors  in  Identification  Time,  Uncertainty,  and  Accuracy 

Only one acoustic variable correlated significantly with LMIT, the magnitude
of the maximum FFT component in the spectrum (r = -.40, p < .009). Two acoustic
variables correlated significantly with Hcu: 1) the magnitude of the maximum FFT
component in the spectrum, r = -.33, p = .03; and 2) the kurtosis of the FFT spectral
distribution, r = .37, p = .02. Two acoustic variables correlated with accuracy, the
kurtosis of the FFT distribution, r = -.41, p < .007, and AF2, r = -.38, p < .02.

However, these spectral attributes account for little of the variance in LMIT,
Hcu, and accuracy, and considering the number of correlations that were examined,
probably  represent  Type  I  errors. 
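
The concern about Type I errors can be made concrete with a small Python sketch. The report does not state how many correlations were examined, so the count used here is purely hypothetical, and the family-wise formula assumes independent tests with every null hypothesis true.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """P(at least one false positive) for independent tests, all nulls true."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests

n_tests = 60                       # hypothetical number of correlations
expected_false_positives = n_tests * 0.05
rate = familywise_error_rate(n_tests)
threshold = bonferroni_alpha(n_tests)
```

Under these assumptions a few nominally significant correlations are expected by chance alone, which is the reasoning behind treating the weak acoustic correlations with caution.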

Accuracy  and  Identification  Time 

Correlation of accuracy and LMIT was significant, r = -.72, p < .0001, but
weaker than the correlation between causal uncertainty and LMIT, r = .89, p < .0001.
The  direction  of  this  relationship  is  opposite  to  what  would  be  expected  from 
models  of  speed-accuracy  tradeoff,  which  assume  that  "average  correct  reaction 
time  is  inversely  related  to  error  rate"  (Pachella,  1974,  p.  62). 

Perceptual-Cognitive  Ratings 

The  null  hypothesis  that  the  data  were  from  a  normal  distribution  was 
rejected for only one of the 23 scales, the identifiability of the sound (Shapiro-Wilk
statistic W = .94, p = .047). The distribution on this scale was bimodal, suggesting
that the sounds were heard as either identifiable or not.

The  nature  of  the  41  sounds  is  revealed  in  descriptive  statistics  of  the 
ratings. The highest average rating was for clarity (3.81), and the lowest was for
the number of sounds that were similar (2.51). The highest variability was for the
identifiability of the sounds (SD = .94) and the lowest was for the necessity of
hearing the sound within a sequence of sounds in order to identify it (SD = .37).

Significant  relationships  were  found  between  perceptual-cognitive  ratings 
and acoustic measures. Power was correlated with loudness (r = .49, p < .001),
and with the ratings of hardness, of angularity, of sharpness, of tenseness, of
unpleasantness and of compactness (.33 < r < .39, p < .05). The relaxed/tense
rating of the sound correlated with the second moment of the spectrum (r = -.48, p =
.001), with the kurtosis of the spectrum (r = .39, p = .01), with the average
magnitude of the spectrum (r = .63, p = .0001), and with AF1 and AF2 representing
octave bands above 1000 Hz. The highest correlation between the relaxed/tense
rating and an octave band measure was with the band centered at 2500 Hz (r =
.63, p < .0001). The correlations between relaxed/tense rating and octave band
measures dropped off in each direction from 2500 Hz. The dull/sharp rating
correlated with AF1 and AF2 (r = .48, .51, p < .001). Besides power, loudness
correlated with duration (r = .38, p = .01), magnitude of the maximum spectral
component (r = .60, p = .0001), with AF2 (r = .60, p = .0001), and with the octave
bands that compose AF2.

Overall  ratings  of  the  ease  in  identifying  the  cause  were  highly  correlated 
both  with  ratings  assessing  the  action  of  the  cause  and  with  ratings  assessing  the 
agent  of  the  cause.  The  ease  of  forming  a  mental  picture  of  the  cause  was 
significantly and similarly correlated both with the ease in forming a mental picture
of the agent and with the ease in forming a mental picture of the action (r = .98, p <
.0001 for both correlations). The ease in describing the event with words was
correlated both with ease in describing the agent and with ease in describing the
action (r = .91, p < .0001 for both correlations).

The  ratings  were  analyzed  using  a  principal  components  analysis  to 
determine  if  fewer  components  would  account  for  the  variability  in  the  ratings.  The 
ratings  specific  to  the  action  and  agent  just  discussed  were  not  used  in  this 
analysis.  Three  factors  which  accounted  for  87%  of  the  variance  in  the 
eigenvalues  were  retained.  The  first  two  factors  alone  accounted  for  80%  of  the 
variance.  The  unrotated  solution  was  interpretable,  and  gave  results  similar  to  a 
rotated  solution  using  the  varimax  rotation.  But  the  rotated  solution  improved  the 
interpretation  of  the  factor  loadings  somewhat,  and  only  reduced  the  amount  of 
variance  explained  by  the  first  factor  from  39%  to  37%.  Factor  loadings  are  shown 
in  Table  3. 

The first factor (PC1) is composed of ratings which are all highly correlated
(p < .0001) with the rated identifiability of the sound. These rating scales and their
correlations with identifiability include the ease with which a mental picture is
formed of the sound (r = .99), the familiarity of the sound (r = .96), identifiability of
the sound when presented in isolation (r = .94), the similarity of the sound to a
mental stereotype (r = .90), the ease in using words to describe the sound (r = .88),
and the clarity of the sound (r = .88).

The  second  factor  (PF2)  is  composed  of  ratings  of  sound  quality.  Rating 
scales  which  load  high  on  this  factor  include  relaxed/tense,  round/angular, 
dull/sharp, pleasant/unpleasant, and loudness. Two of these ratings,
round/angular and loudness, correlated significantly with identifiability, but the
correlations were low (r = .31, p = .05).
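
The borderline significance of a correlation of .31 can be checked with the usual t test for a Pearson correlation. The sketch below assumes the correlations were computed across the 41 sounds (so df = 39); the critical value is taken from standard t tables, and the function name is hypothetical.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

N_SOUNDS = 41       # assumed unit of analysis: one observation per sound
T_CRIT_05 = 2.023   # two-tailed critical t, df = 39, alpha = .05 (from tables)

t = t_from_r(0.31, N_SOUNDS)
print(round(t, 2))          # 2.04
print(abs(t) > T_CRIT_05)   # True, but only by a small margin
```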

The third perceptual-cognitive factor (PF3) is composed of ratings of the
number of sounds in the same category, the number of similar sounds, and the
number of events which could cause the sound. Together these three ratings
suggest that PC3 is a measure of sound uniqueness.

The  rating  scale  for  the  number  of  events  which  could  cause  the  sound  was 
intended  to  measure  causal  uncertainty,  but  was  poorly  designed  to  achieve  this 
purpose.  Instead,  it  tapped  the  uniqueness  of  the  sound.  It  was  expected  that  this 
scale would relate to PC1 because of the high correlation between PC1 and Hcu.
However, the scale loaded highly on PC3 instead of PC1 because it measures a
different aspect of causal uncertainty. The scale took the following form:

How many events can you think of which could have caused this sound?

       1         2         3         4         5
    not very                                 very
      many                                   many

Note  that  a  listener  could  use  either  endpoint  for  a  sound  that  is  difficult  to 
identify.  If  the  sound  is  difficult  to  identify  because  the  person  is  unfamiliar  with  the 
sound  or  there  is  insufficient  acoustic  information  for  identification,  then  a  response 
of  "not  very  many"  would  be  appropriate.  On  the  other  hand,  if  the  sound  is  difficult 
to  identify  because  many  events  could  produce  it,  then  the  other  end  of  the  scale 
would  be  used.  Thus  this  scale  assessed  whether  a  sound  is  associated  with  few 
or  many  events.  It  correlated  weakly  with  rated  identifiability  and  in  a  direction 
opposite  to  what  would  be  expected  if  the  scale  was  confounded  with  identifiability 
(r = .31, p = .05). A second aspect of this scale deserving discussion is the use of
the word "event" as a cause. This could have focused the listener's thoughts on the
occasions in which the sound occurs, rather than the agents and actions that
actually  produce  the  acoustics  of  the  sound.  This,  together  with  the  meaning  of  the 
other  two  ratings  which  loaded  high  on  PC3,  would  suggest  that  a  "unique"  sound 
is  one  which  has  few  similar  sounds  in  the  same  category  and  which  rarely  occurs. 

Perceptual-Cognitive  and  Acoustic  Factors  in  Identification 

One  of  the  most  important  questions  in  analyzing  the  identification  of  these 
41  sounds  is  the  relationship  between  acoustic  attributes,  perceptual-cognitive 
judgments,  and  the  identification  of  the  sound.  Multiple  regression  analysis  was 
used  to  find  multiple  correlates  of  identification  performance  such  as  identification 
time,  identification  accuracy,  and  perceived  identifiability.  Stepwise  multiple 
regression was performed with the independent variables including Hcu, the factor
scores from the octave band measures, the factor scores from the
perceptual-cognitive ratings, and other acoustic measures. With LMIT as a dependent
variable, the independent variables that produced significant (p < .05) increments
in R² were Hcu and PC1 (identifiability). The R² with these two independents
was .85; with Hcu alone, R² is .79. No single variable correlates as highly with
LMIT as Hcu (and only one rating, the similarity of the sound to a mental stereotype,
correlates as highly with LMIT as [...]).

With accuracy as a dependent variable, independent variables that
produced significant increments in R² were Hcu, PC1, and the peak amplitude in
the wave. R² with these three variables was .67. Each of the variables PC1 and
peak amplitude added about 5% to R². When the dependent variable was the
rated identifiability of the sound, the independent variables that produced
significant increments in R² were PC1, the familiarity with the sound event (not
familiarity with the sound itself, which is included in PC1), and the peak amplitude
in the wave. R² was .97 with these independents. However, the increase in R²
after PC1 was only 1%. Taken together with the previous results, performance
measures of identification such as response time, accuracy and perceived
identifiability are related to causal uncertainty (as quantified in Hcu values), to
perceptual-cognitive judgments of the sound, and for accuracy, to the peak
amplitude in the wave.
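
The logic of the stepwise analyses can be sketched with a simple forward selection in Python. This is an illustration under assumptions and not the procedure used in the report: predictors are added by their gain in R² against a fixed threshold (a stand-in for the significance test on each increment), the function names are hypothetical, and the data are synthetic. With a single predictor, R² is the squared correlation, so a correlation of .89 corresponds to an R² of about .79.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of a least squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def forward_select(X, y, min_gain=0.01):
    """Add, one at a time, the predictor giving the largest gain in R^2."""
    chosen, history, best = [], [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        gains = {j: r_squared(X[:, chosen + [j]], y) - best for j in remaining}
        j = max(gains, key=gains.get)
        if gains[j] < min_gain:      # stand-in for a significance test
            break
        chosen.append(j)
        remaining.remove(j)
        best += gains[j]
        history.append(best)
    return chosen, history

# Synthetic example: 41 cases, three candidate predictors, two of them relevant.
rng = np.random.default_rng(0)
X = rng.standard_normal((41, 3))
y = 3 * X[:, 0] + X[:, 1] + 0.1 * rng.standard_normal(41)
chosen, history = forward_select(X, y)
```

The `history` list holds the cumulative R² after each step, which is the quantity the text reports when it says a variable "added about 5%".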

Cluster  Analysis  of  Sounds 

The listing in Table 1 is sorted by increasing Hcu and a casual scan of the
listing suggests that there are categories of sounds that vary in Hcu and in MRT.
For example, several of the sounds that are low in Hcu and MRT are signalling
sounds such as telephone, car horn, and doorbell. Furthermore, most of the water
sounds such as drip, bubbling, oar rowing, and flush are in the lower half of the
listing of Hcu and LMRT. This suggests two categories of sounds, signalling and
water, which have similar uncertainties and identification times within the
category.

There  has  been  virtually  no  research  about  the  categories  that  listeners 
might  use  in  perceiving  everyday  sound,  let  alone  the  basis  for  these  categories.  In 
order  to  investigate  category  structure  in  the  41  sounds  used  in  this  study,  two 
types  of  cluster  analyses  of  the  sounds  were  conducted.  The  first  analysis  was 
intended  to  determine  whether  the  perceptual  and  cognitive  ratings  of  the  sounds 
would  produce  interpretable  clusters  of  the  sounds.  If  this  were  the  case,  then  the 
cluster structure might reflect knowledge about sounds and form the basis of the
ratings.  Accordingly,  factor  scores  for  PF1,  PF2  and  PF3  were  used  in  a 
hierarchical  cluster  analysis. 

The  second  analysis  was  designed  to  determine  how  the  sounds  would 
cluster on the basis of identification responses, and to determine whether there
were  sounds  that  might  be  confused  as  evidenced  by  similar  identification 
responses.  Accordingly,  the  hierarchical  cluster  analysis  was  based  upon  an  index 
of  causal  similarity  calculated  from  a  confusion  matrix  of  identification  responses. 
The  confusion  matrix  was  based  upon  an  analysis  of  the  similarity  of  the  events 
used  to  identify  pairs  of  the  sounds. 

Perceptual/cognitive ratings clustering. To discover how the sounds would
cluster based upon perceptual/cognitive ratings, a complete linkage cluster
analysis was done with PF1, PF2, and PF3 as the clustering variables. Factor
scores for these variables were used directly except for selected changes in sign to
improve the interpretation of cluster plots. There were four major clusters as shown
by the tree diagram in Figure 1. This tree diagram and others to follow indicate
clustered components with Xs in the column beneath the sound(s) that are in the
cluster. The distance between the clusters is indicated in the margin. Interpretation
of the four clusters is aided by plotting the sounds in 3-D space (Figures 2-5) with
the dimensions being the three variables used in the cluster analysis, Identifiability
(PF1), sound Quality (PF2), and sound Uniqueness (PF3).
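The procedure described above can be sketched in a few lines of code. The following is a minimal illustration, not the software used in the study: it runs complete linkage clustering on the factor scores of six sounds copied from Table 4. Euclidean distance between factor-score triples is an assumption (the text does not name the metric), and the function name `complete_linkage` is hypothetical. Changing the sign of a whole factor, as was done for the plots, would not alter Euclidean distances.

```python
from itertools import combinations
from math import dist

# Factor scores (Identifiability, Quality, Uniqueness) for six sounds,
# copied from Table 4. The choice of these six sounds is illustrative only.
SCORES = {
    "telephone ring": (1.8051, 0.3559, 0.7269),
    "doorbell": (1.8169, -0.6765, -0.0484),
    "water drip": (1.1189, -1.2313, 0.4162),
    "oar rowing": (0.7036, -1.3898, -0.2945),
    "tree sawing": (-1.5771, -0.0879, 0.5163),
    "electric lock": (-1.4382, 0.7439, 0.8227),
}


def complete_linkage(points):
    """Merge clusters until one remains; return the list of merges.

    Each merge is (members_a, members_b, height), where height is the
    complete linkage distance at which the two clusters were joined.
    Euclidean distance is an assumption, not stated in the report.
    """

    def gap(pair):
        # Complete linkage: distance between two clusters is the largest
        # distance between any member of one and any member of the other.
        a, b = pair
        return max(dist(points[x], points[y]) for x in a for y in b)

    clusters = [frozenset([name]) for name in points]
    history = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=gap)
        history.append((sorted(a), sorted(b), round(gap((a, b)), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return history


for left, right, height in complete_linkage(SCORES):
    print(f"{height:6.3f}  {left} + {right}")
```

With these six sounds, the two water sounds join first and the two low-identifiability sounds join next, which mirrors the grouping reported for the full set.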

The  first  cluster  consists  mostly  of  sounds  that  are  produced  with  water  (drip, 
splash,  bubble,  flush)  or  in  a  water  context  (boat  whistle,  foghorn).  Additional 
sounds  in  this  cluster  include  the  lighter  and  clock  ticking.  However,  one  of  these, 
the lighter sound, is at the edge of the cluster and is the lowest of the cluster on
identifiability.  Most  of  the  sounds  have  negative  sound  quality  scores  (i.e.,  ratings 
of  soft,  round,  dull,  relaxed  and  pleasant),  and  three  sounds  (lighter,  flush,  and 
foghorn)  have  the  highest  uniqueness  scores  of  all  41  sounds.  High  uniqueness  is 
related  to  ratings  that  there  are  few  sounds  in  the  same  category,  few  similar 
sounds,  and  few  events  could  be  thought  of  which  could  cause  the  sound.  These 
three  sounds  with  high  uniqueness  scores,  together  with  the  boat  whistle,  form  a 
sub-cluster.  The  other  sub-cluster  includes  three  water  sounds  and  the  clock  tick. 

The  second  cluster  consists  of  several  signaling  sounds  (telephone, 
doorbell,  bugle,  subhorn,  and  carhorn),  and  sounds  that  connote  danger  (fireworks, 
auto  rifle,  and  power  saw).  Two  of  the  non-signaling  sounds,  fireworks  and 
powersaw,  stand  at  the  edge  of  the  cluster  and  are  low  in  identifiability  compared  to 
the rest of the sounds in the cluster. These sounds have high identifiability scores
(Figure 3) and positive sound quality scores (i.e., ratings of hard, angular, sharp,
tense, and unpleasant). Three sub-clusters are evident. One includes signalling
sounds (doorbell, telephone ring and bugle), one includes the fireworks, subhorn,
and powersaw sounds, and the third includes the auto rifle and carhorn.

The  third  cluster  includes  sounds  that  have  negative  identifiability  scores 
(Figure 4), meaning they were rated as difficult to identify. It includes several door
sounds (jail door closing, door opening, electric buzzer [used on some doors to
remotely open the door], and key inserted in lock), three engine sounds (car
backfire, car ignition, and lawnmower), a sound that was sometimes identified as
an engine sound (tree saw), and two other sounds (bacon frying and rifle shot
outdoors). Within the cluster, these sounds have somewhat different acoustics and
generally there is a combination of negative and positive sound quality scores,
rather than a dominance by one or the other as in clusters 1 and 2. Several of the
sounds have perceptible echoes, such as the jail door, the outdoor rifle shot, and
the car backfire. Four sub-clusters comprise this cluster, but the distances between
these sub-clusters are small compared to the distance between this major cluster
and the other major clusters. Thus this major cluster is the most homogeneous of
the  four. 

The  fourth  cluster  includes  most  of  the  non-signalling  and  non-water  sounds 
that  have  two  or  more  transient  components  (light  switch,  stapler,  footstep, 
clogstep,  phone  hang,  file  cabinet,  door  knock,  hammer,  corkpop,  and  door  close). 
It  also  includes  two  bell  sounds,  the  touchtone  sound,  and  several  single  transient 
sounds  (tree  chop,  and  rifle  indoors).  Most  of  the  transient  sounds  in  this  cluster 
have  sharp  attacks  and  most  have  negative  uniqueness  scores  but  vary 
moderately  in  identifiability  (Figure  5).  This  result  is  consistent  with  the  conclusion 
that  uniqueness  is  not  confounded  with  identifiability.  There  are  two  sub-clusters 
within  this  cluster,  and  each  sub-cluster  is  further  divided  into  two  clusters. 

In  summary,  clustering  of  the  sounds  using  scores  on  three 
perceptual/cognitive  factors  produces  four  clusters,  identified  by  the  majority 
members  as  follows:  a  water  cluster,  a  signal  sound  cluster,  a  cluster  of  sounds 
difficult  to  identify,  and  a  cluster  of  multiple  transient  sounds.  At  a  higher  level,  the 
water  and  signal  clusters  combine,  and  the  multiple  transient  and  poor 
identifiability  clusters  combine,  probably  on  the  basis  of  identifiability  scores 
because in general the signal and water sounds have lower Hcu values.

Identification response clustering. In identification research, confusion
matrixes are frequently used to discover perceptual structure. Often, the goal is to
discover  the  psychophysical  dimensions  that  form  the  basis  for  perceptual 
judgments.  Until  now,  the  analysis  of  the  41  sounds  has  been  based  upon 
measures  of  uncertainty,  identification  time,  acoustic  parameters  calculated  from 
the  sounds,  and  the  perceptual/cognitive  ratings.  However,  these  data  do  not 
address  the  issue  of  identification  confusions  within  the  set  of  41  sounds.  They 
certainly  cannot  form  the  basis  for  a  confusion  matrix  of  identifications.  However, 
identification  responses  can  be  used  to  produce  a  confusion  matrix.  This  in  turn 
can  be  used  to  calculate  an  index  of  identification  confusion  for  pairs  of  sounds, 
which  can  serve  as  a  distance  measure  in  a  cluster  analysis.  A  cluster  analysis  of 
these  distances  would  suggest  the  alternative  choices  a  listener  might  consider  in 
making  an  identification  response. 

In  order  to  develop  a  confusion  matrix,  the  identification  responses  for  the  41 
sounds  were  combined  and  sorted  by  similar  response  and  by  sound.  Altogether, 
1795  identification  responses  were  sorted  into  categories  of  events  using  the 
criteria developed by Ballas and Sliwinski to sort the identification responses for a
single sound. A confusion matrix was generated by counting the number of event
categories that pairs of sounds had in common. Using only event categories that
occurred for at least two sounds resulted in a total of 66 categories. A data matrix
was formed of 66 event categories by the 41 sounds, with the entries a binary
notation of the occurrence of an event category used to identify a sound. Distance
between sounds was computed from this matrix as follows:

D_ij = distance between sound i and sound j
e_ij = number of events cited in common for sounds i and j.
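The counting step that yields the number of shared event categories can be illustrated with a short sketch. The sounds, event categories, and binary entries below are hypothetical stand-ins for the 66-by-41 data matrix, and the function name `shared_event_counts` is an assumption. The sketch stops at the counts and deliberately does not convert them to distances, because the conversion formula is not legible in this copy of the report.

```python
# Hypothetical binary data matrix: one row per event category, one column
# per sound. A 1 means the event category was cited for that sound.
SOUNDS = ["hammer", "door knock", "water drip"]
EVENTS = {
    "hammering a nail": [1, 1, 0],
    "knocking on wood": [1, 1, 0],
    "footsteps": [0, 1, 0],
    "faucet dripping": [0, 0, 1],
    "clock ticking": [0, 1, 1],
}


def shared_event_counts(events, n_sounds):
    """Return a matrix whose [i][j] entry is the number of event
    categories cited for both sound i and sound j (diagonal left at 0)."""
    counts = [[0] * n_sounds for _ in range(n_sounds)]
    for row in events.values():
        cited = [k for k, flag in enumerate(row) if flag]
        for i in cited:
            for j in cited:
                if i != j:
                    counts[i][j] += 1
    return counts


e = shared_event_counts(EVENTS, len(SOUNDS))
for name, row in zip(SOUNDS, e):
    print(f"{name:12s} {row}")
```

In this toy matrix the hammer and door knock share two event categories, so any distance that decreases with the shared count would place them closest together.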

These  distance  data  were  used  in  a  cluster  analysis.  Two  solutions  were 
informative,  one  based  upon  single  linkage  or  the  minimum  method  (tree  diagram 
in  Figure  6),  and  one  based  upon  complete  linkage  or  the  maximum  method  (tree 
diagram in Figure 7). The single linkage clustering produces fewer clusters,
irregular in shape, whereas the complete linkage clustering produces more
clusters, most of which are compact and similar in shape. In both solutions,
distance between clusters indicates identification confusion inversely: there is
more confusion with smaller distances between the clusters. There are similarities
in the two solutions. Both produce two large clusters of the sounds, one composed
mostly of impact sounds, and the other composed of water, signalling, and
continuous sounds. In both solutions, the first four clusters formed are identical.
However,  with  some  exceptions,  the  complete  linkage  algorithm  continues  to  form 
cluster  pairs  whereas  the  single  linkage  algorithm  joins  sounds  to  the  first  four 
clusters.  The  single  linkage  solution  is  therefore  useful  in  seeing  the  hierarchical 
nature  of  sound  identification  confusions,  whereas  the  complete  linkage  solution  is 
useful  in  finding  sound  pair  confusions.  The  reason  for  this  is  based  upon 
differences  in  the  algorithm  for  the  two  solutions.  In  single  linkage,  distance 
between  clusters  is  based  upon  the  minimum  distance  between  any  pair  of 
observations.  Therefore,  an  existing  cluster  can  pick  up  additional  members  even 
if  it  has  existing  members  that  are  very  different  from  the  new  addition.  In  complete 
linkage,  distance  is  based  upon  the  maximum  distance  between  any  pair  of 
observations.  Therefore,  additional  members  will  be  compared  to  the  most  distant 
member  of  existing  clusters.  Clustering  is  biased  toward  the  formation  of  paired 
clusters. 
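The difference between the two linkage rules can be made concrete with a toy example. The one-dimensional values below are hypothetical and chosen only to show the effect; they are not data from the study.

```python
def single_link(a, b):
    """Single linkage: smallest distance between any pair of members."""
    return min(abs(x - y) for x in a for y in b)


def complete_link(a, b):
    """Complete linkage: largest distance between any pair of members."""
    return max(abs(x - y) for x in a for y in b)


# Hypothetical positions on a line: an elongated existing cluster,
# a compact pair, and a newcomer that must join one of them.
chain = [0.0, 1.0, 2.0]
pair = [5.0, 5.5]
newcomer = [3.1]

for name, link in (("single", single_link), ("complete", complete_link)):
    to_chain = link(newcomer, chain)
    to_pair = link(newcomer, pair)
    choice = "chain" if to_chain < to_pair else "pair"
    print(f"{name:8s} linkage: chain={to_chain:.1f} pair={to_pair:.1f} -> joins {choice}")
```

Under single linkage the newcomer joins the elongated cluster because it is near one end of it, even though it is far from the other end. Under complete linkage the far end counts against that cluster, so the newcomer joins the compact pair instead.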

In the single linkage solution, the impact sound cluster is composed of three
sub-clusters: 1) corkpop, tree chop, file cabinet; 2) door open, door close; and 3)
door knock, hammer. The remaining impact sounds are joined to the cluster
formed by these three sub-clusters. The water and signalling cluster consists of a
sub-cluster of water sounds (bubble, splash, drip), and water-related sounds (flush
and bacon frying, which sounds like rain) joined to this sub-cluster. This is the only
sub-cluster within the water-signal cluster that has components as close as the
three sub-clusters in the impact cluster. Three other sub-clusters are evident but
the distances are greater, meaning that identification confusion is less. These
include a cluster of signalling sounds, three of which are produced by bells
(telephone ring, doorbell, church bell, and touchtone), a cluster of horns (car horn,
fog horn), and a cluster of two engine sounds (lawnmower, car ignition). The rest of
the sounds in this major cluster are joined to these four sub-clusters.

The  complete  linkage  solution  is  characterized  by  smaller  clusters  within  the 
two  major  clusters.  The  water-signalling  sub-cluster  is  composed  of  clusters  of 
sound pairs including lighter and tree saw, subhorn and powersaw, lawnmower
and  car  ignition,  telephone  ring  and  doorbell,  touchtone  and  church  bell,  drip  and 
splash,  bubbling  and  bacon  frying,  flush  and  bell  buoy.  One  sub-cluster  is  best 
characterized  as  a  triplet  of  the  foghorn,  car  horn,  and  bugle.  Most  of  these  pairs 
have  similar  acoustic  signatures.  The  impact  sound  cluster  consists  of  sound  pairs 
and  triplets.  The  pairs  include  hammer  and  door  knock,  door  open  and  door  close, 
car  backfire  and  auto  rifle.  The  triplets  include  stapler,  fireworks,  and  rifle  outdoors, 
and a triplet of cork pop, tree chop, and file cabinet. Other sounds are joined to the
pairs or triplets. One sub-cluster that is clearly evident in the tree is the four impact
sounds  resulting  from  the  inclusion  of  clog  step  and  footstep  to  the  hammer  and 
door  knock  pair. 

In  summary,  the  clustering  of  identification  responses  using  a  causal 
similarity  index  produces  clusters  that  clearly  have  similar  acoustic  signatures. 
Different  clustering  criteria  result  in  similar  solutions  with  two  major  clusters 
emerging,  one  including  most  of  the  impact  sounds,  the  other  the  water,  signalling 
and  continuous  sounds.  Minor  clusters  consist  of  sounds  that  have  similar  acoustic 
signatures.  As  a  whole,  the  set  of  sounds  includes  a  number  of  pairs  that  are 
confused,  and  a  small  number  of  larger  clusters  of  sounds  that  are  confused. 

Comparing  the  clustering  of  the  sounds  on  perceptual/cognitive  scores  with 
the  clustering  on  identification  response  similarity  shows  similarities  at  the  highest 
level,  but  differences  at  lower  levels.  Overall,  both  clustering  approaches 
presented  groupings  of  water  sounds  and  impact  sounds.  The  factor  score 
clustering produced solutions that revealed similarities in how the sounds are
perceived, whereas the identification response clustering revealed identification
confusions.
In  some  respects,  the  two  clustering  approaches  produced  inverse  solutions.  For 
example,  the  factor  score  clustering  produced  a  cluster  of  sounds  that  are 
identifiable,  composed  mostly  of  signal  sounds.  These  same  sounds  were  not 
clustered  in  the  identification  response  clustering  until  the  distance  between 
clusters  was  increased.  Thus  although  signal  sounds  have  similar  perceptual 
properties,  they  are  not  necessarily  confused  in  identifications,  but  in  fact  are  quite 
identifiable.  Both  approaches  produced  a  water  cluster,  and  a  cluster  of  impact 
sounds,  suggesting  that  water  sounds  and  impact  sounds  have  properties  that 
unite  them  in  a  perceptual/cognitive  domain  and  also  make  them  confusable 
sounds. 


Discussion 

The  studies  of  these  41  sounds  have  produced  the  following  results 
relevant  to  understanding  the  identification  of  isolated  everyday  sounds,  subject  to 
the  limitations  of  the  stimulus  set: 

1. The time to identify a brief everyday sound increases as Hcu increases
and as the perceived identifiability of the sound decreases.

2. Perceived identifiability is related to the ease with which a mental picture
is  formed  of  the  sound,  the  familiarity  of  the  sound,  the  ease  in  identifying  the  sound 
in  isolation,  the  similarity  of  the  sound  to  a  mental  stereotype,  the  ease  in  using 
words  to  describe  the  sound,  and  the  clarity  of  the  sound.  Listeners  did  not 
distinguish  between  their  ability  to  imagine  or  describe  the  agent  and  their  ability  to 
imagine  or  describe  the  action  involved  in  the  cause  of  the  sound. 

3. Spectral acoustic variables are relatively minor factors in the time to
identify a sound, in Hcu, and in perceived identifiability of a sound. They are
related to perceptual-cognitive judgments of the sound quality. The weak
relationship between Hcu and spectral magnitude and kurtosis should be
interpreted with caution, because Ballas and Barnes (1988) found that in a different
set of sounds, the average frequency in the spectral distribution was inversely
related to Hcu.

4.  Clustering  of  the  sounds  using  scores  on  three  perceptual/cognitive 
factors  produces  four  clusters,  a  water  cluster,  a  signal  sound  cluster,  a  cluster  of 
sounds  difficult  to  identify,  and  a  cluster  of  multiple  transient  sounds.  Clustering 
sounds on the basis of a causal similarity index produces two major clusters, one
including  most  of  the  impact  sounds,  the  other  water,  signalling  and  continuous 
sounds.  Small  clusters  were  based  upon  pairs  of  sounds  that  seemed  to  have 
similar  acoustic  signatures,  but  this  similarity  is  not  captured  by  similarity  of  1/3 
octave  profiles. 

The results show that LMIT is estimated better with Hcu than by the acoustic
measures computed for these sounds. However, there are well known limitations
of information measures (Wickens, 1984). One of the limitations of the Hick-Hyman
law is that it does not account for the effect of non-information variables (subset
familiarity, stimulus discriminability, repetition effect, stimulus-response
compatibility, and practice) on response time. However, the sounds were
discriminable from one another based upon the ABX results, and were presented
only once to the listeners in the Ballas and Sliwinski study in random order. Thus
there was no opportunity for these effects to develop. If the relationship between
response time and Hcu is due to discriminability effects, it would not be
discriminability within the set of 41 sounds. Instead, the relationship would be due
to the discriminability of sounds representing alternative causes for the sounds
actually heard. For example, the increased time to identify the sound of a door
closing, a very familiar event, could be due to response competition from
reasonable, alternative causes for this sound. This sets up a classical choice
response time task, where the number of choices is determined by the number of
reasonable alternative causes for a sound. These alternatives were not presented
to the listeners, and were not represented in the set (except for two sounds which
will be discussed shortly).
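For reference, the Hick-Hyman law invoked above is conventionally written as a linear relation between choice response time and the information conveyed by the set of alternatives. The constants a and b are empirical and are not values taken from this report; reading H as the causal uncertainty of a sound is the interpretation suggested in the preceding paragraph, not a fitted model.

```latex
\[
  RT = a + b\,H,
  \qquad
  H = -\sum_{i=1}^{N} p_i \log_2 p_i ,
\]
\[
  \text{which reduces to } H = \log_2 N
  \text{ when the } N \text{ alternatives are equally likely } (p_i = 1/N).
\]
```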

It is possible that many of the perceptual-cognitive judgments, and even the
measure of Hcu, may be redundant. Clearly, there is redundancy between the
rating of identifiability and many of the other ratings. These ratings simply amplify
on what is meant by an identifiable sound: it is one which generates a mental
picture, can be described easily with words, is similar to a stereotype, can be
identified when presented in isolation, and is clear. Clarity probably refers to the
lack of spectral complexity because of the significant correlations between
identifiability and both the magnitude of the maximum value in the spectrum and
the kurtosis of the spectral distribution.

Hcu may be redundant with the perceptual-cognitive judgments, because it
correlates with PC1 (r = .79, p < .0001) but not with PC3 (r = .04, p = .82), which
represents sound uniqueness. It is calculated from the aggregated responses of a
group of listeners, is equivalent to response equivocation, and is properly
considered a response measure. Thus one would expect it to be related to
judgments of the ease in describing the sound with words, and forming a mental
image of the sound, two components of PC1. It is also correlated with the rating of
the stereotypy of the sound (r = .85, p < .0001), and in fact, this property of a sound
may be the most important component in identifiability. Stereotypy would certainly
be responsible for the quick identification and high identifiability ratings of synthetic
signalling sounds such as the telephone ring, the doorbell, and car horn. A strong
stereotype would exist for these sounds. But sounds with lower Hcu values also
include water sounds, which cannot be restricted by design as can the synthetic
sounds. Stereotypy can account for the identifiability of synthetic sounds, but can it
account for the identifiability of natural sounds?
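To make the idea of a response-equivocation measure concrete, the sketch below computes the entropy of the event categories that a group of listeners gave for one sound. It assumes that the uncertainty measure is the Shannon entropy (in bits) of the category proportions, which is the usual form of an equivocation measure but is not restated in this section. The function name `causal_uncertainty` and the response lists are hypothetical.

```python
from collections import Counter
from math import log2


def causal_uncertainty(responses):
    """Shannon entropy (bits) of the event categories named for one sound.

    Assumption: uncertainty is computed from the proportion of listeners'
    responses that fall in each event category.
    """
    counts = Counter(responses)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Hypothetical responses: every listener agrees on a stereotyped signal,
# while responses to an ambiguous impact sound are spread over four causes.
agreed = ["telephone"] * 8
spread = ["stapler", "light switch", "door latch", "gun cocking"] * 2

print(causal_uncertainty(agreed))
print(causal_uncertainty(spread))
```

Complete agreement gives zero uncertainty, and responses spread evenly over four causes give two bits, so a strongly stereotyped sound would sit at the low end of such a scale.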

The  results  suggest  that  identification  is  largely  based  upon  reference  to  a 
stereotype  for  the  sound.  A  stereotype  might  include  multiple  attributes  and  further 
research  could  pursue  the  nature  of  the  stereotype.  In  the  absence  of  a  strong 
stereotype,  alternative  causes  must  be  considered.  These  alternatives  establish 
alternative  choices,  and  the  inability  to  discriminate  between  these  alternatives 
would  increase  identification  time  in  the  same  manner  that  stimulus 
indiscriminability  increases  response  time  in  a  typical  choice  response  task.  On 
the basis of response equivocation, some sounds have few if any alternatives;
others have many. The possibility (or lack thereof) of alternatives may come from
the  lack  of  a  stereotype,  similarity  in  acoustics,  limitations  of  perception,  or  known 
variability  in  the  sound  of  an  event.  Any  account  of  everyday  sound  perception,  if  it 
is  to  address  the  perception  beyond  a  limited  domain  of  sounds,  must  address  the 
possibility  of  alternative  causes  and  how  they  are  considered  by  the  listener. 
Current research is assessing the effect of context (Ballas & Mullins, 1989).

The results of these experiments are subject to the limitations of the stimulus
set used throughout the experiments. Listeners felt that the sounds were clear,
varied in identifiability, and that presenting them in isolation did not diminish the
identifiability. These characteristics are what one would want in a set of isolated
sounds to study identification processes. But the findings for these sounds have
also been obtained with other sounds. The relationship between Hcu and identification
time was first found with a different set of sounds (Ballas, Sliwinski, & Harding,
1976) which included animal vocalizations. The measure of Hcu is consistent for
different exemplars of the same sound (Ballas, Dick, & Groshek, 1987), implying
that the results are not limited to the particular exemplars used in these studies.
Significant correlations have been found between Hcu and rated confidence in
identifying a sound (Ballas & Howard, 1987) in two studies that used two sets of
sounds different from the sounds used here. Finally, several studies have used
sounds longer than the brief duration that has to be used to obtain interpretable
identification response times. Results of these studies (Ballas, Dick, & Groshek,
1987; Ballas & Howard, 1987) are consistent with the general findings reported
here.

Although  the  general  relationships  between  performance  measures 
(identification  time,  causal  uncertainty,  and  perceived  identifiability)  and  measures 
made  on  the  41  stimuli  may  generalize  beyond  this  set  of  sounds,  the  clustering 
results  should  be  generalized  with  caution.  The  clustering  results  show  that 
categories  of  everyday  sounds  are  related  to  acoustic,  perceptual-cognitive,  and 
performance  variables.  But  the  categories  found  in  this  study,  especially  the  two 
major  categories  of  impact  sounds  and  water  sounds,  may  be  determined  by  the 
sounds  in  the  stimulus  set.  For  example,  there  were  not  many  friction  sounds  such 
as  sandpapering,  tires  squealing  and  metal  grinding.  Furthermore,  there  were  no 
wind or storm sounds. A second issue related to the generalization of these
categories  concerns  the  nature  of  the  categories.  They  have  only  been  defined 
here  by  a  listing  of  members,  and  by  relative  scores  on  three  perceptual-cognitive 
dimensions.  Important  questions  remain  about  the  external  and  internal  structure 
of everyday sound categories. These include questions about taxonomic structure,
internal attributes, existence and definition of prototypes, and level of description
(Rosch, 1978).

Table 1

Identification Performance Measures For Test Sounds

     Sound                  MRT     Hcu

 1.  Telephone ring         1253    0.44
 2.  Clock ticking          1592    1.07
 3.  Car Horn               1611    0.75
 4.  Doorbell               1642    0.58
 5.  Automatic rifle        1666    1.89
 6.  Riverboat whistle      1751    1.26
 7.  Water drip             1831    1.14
 8.  Bell buoy              1912    2.81
 9.  Foghorn                2135    2.24
10.  Water bubbling         2325    2.75
11.  Bugle charge           2356    2.19
12.  Rifle shot indoors     2371    2.97
13.  Lawn mower             2596    3.65
14.  Church bell            2614    2.88
15.  Oar rowing             2745    3.37
16.  Door knock             2779    1.98
17.  Toilet flush           2779    1.84
18.  Footsteps              2823    2.53
19.  Fireworks              2926    3.23
20.  Cigarette lighter      3210    3.46
21.  Touch tone dial        3305    2.84
22.  Door opening           3335    2.94
23.  Bacon frying           3422    3.42
24.  Hammering              3624    3.13
25.  Sub dive horn          3695    3.51
26.  Walking in clogs       3799    2.23
27.  Car ignition           3802    3.^7
28.  Wood chop              4071    4.51
29.  Power Saw              4113    4.45
30.  Key in lock            4240    3.67
31.  Cork popping           4296    3.60
32.  File cabinet door      4305    3.34
33.  Door closing           4372    2.90
34.  Car backfire           4610    3.72
35.  Jail door closing      5197    3.96
36.  Rifleshot outdoors     5240    3.88
37.  Light switch           6022    4.40
38.  Stapler                6055    4.65
39.  Telephone hangup       6660    4.78
40.  Tree sawing            6792    4.72
41.  Electric lock          6823    4.11

Note. MRT = mean reaction time (ms); Hcu = median uncertainty values for three
sorters.

Table 2

Factor Loadings for Four Acoustic Factors

Center Freq.
(Hz)           Factor 1   Factor 2   Factor 3   Factor 4

 200             .06        -.16       .92        -.06
 250             .06         .17       .90        -.02
 315            -.04         .03       .86         .40
 400            -.15         .16       .62         .64
 500            -.13         .54       .38         .60
 630             .00         .07       .20         .79
 800            -.03         .34      -.20         .77
1000            -.10         .74       .10         .56
1250            -.04         .82      -.18         .28
1600             .13         .63       .06         .21
2000             .49         .80       .03        -.08
2500             .56         .75       .18         .03
3150             .82         .47       .10         .02
4000             .89         .34       p 1        -.21
5000             .97         .09       .01        -.13
6300             .98        -.07      -.06         .02
8000             .95        -.06       .02         .06

Table 3

Factor Loadings for Three Perceptual-Cognitive Factors

                                       Factor 1   Factor 2   Factor 3

Ease in forming a mental picture         .97        .09        -.03
Isolated identifiability                 .95        .01        -.06
Sound familiarity                        .95        .13        -.11
Similarity to mental stereotype          .91        .20        -.10
Ease in describing sound with words      .89       -.10        -.28
Clarity                                  .89        .01        -.10
Interesting/boring                       .78       -.19        -.00
Relaxed/tense                            .05        .97         .08
Soft/hard                                .02        .95         .19
Round/angular                            .26        .89         .18
Dull/sharp                              -.18        .87         .27
Pleasant/unpleasant                      .28        .86        -.13
Loud/soft                               -.39        .79         .03
Number of sounds in same category       -.15        .16         .92
Number of similar sounds                 .04        .23         .86
Number of causal events                 -.33        .44         .73
Compact/scattered                        .14        .38        -.55

Figure 1. Complete linkage cluster analysis of PF1, PF2, and PF3.

Table 4

Factor Scores on Three Perceptual-Cognitive Factors

     Sound                  Identifiability   Quality    Uniqueness

 1.  Telephone ring              1.8051        0.3559      0.7269
 2.  Clock ticking               0.8643       -1.2319     -0.1267
 3.  Car Horn                    1.7320        1.7817     -0.1631
 4.  Doorbell                    1.8169       -0.6765     -0.0484
 5.  Automatic rifle             1.4772        2.3056      0.7344
 6.  Riverboat whistle           0.4979       -0.6232      0.8122
 7.  Water drip                  1.1189       -1.2313      0.4162
 8.  Bell buoy                   0.9738        0.4416     -0.7944
 9.  Foghorn                     0.8993       -1.2254      1.5758
10.  Water bubbling              0.5305       -2.5203     -0.3634
11.  Bugle charge                1.8085       -0.0901     -0.6622
12.  Rifle shot indoors         -0.3993        1.2374     -1.4659
13.  Lawn mower                 -0.2245        0.5973      0.1640
14.  Church bell                 0.9158        0.0325     -1.4782
15.  Oar rowing                  0.7036       -1.3898     -0.2945
16.  Door knock                  0.4689        0.3009     -1.5785
17.  Toilet flush                0.6423       -1.3463      1.8883
18.  Footsteps                  -0.4304       -1.2338     -1.4979
19.  Fireworks                   0.7020        1.7191      0.7161
20.  Cigarette lighter          -0.8224       -1.4921      2.0140
21.  Touch tone dial             0.3142       -0.4103     -0.4945
22.  Door opening               -0.9378        0.3070      0.0268
23.  Bacon frying               -0.8000        0.3573      0.5866
24.  Hammering                  -0.1352        0.1746     -2.7178
25.  Sub dive horn               0.5889        0.8699      1.0630
26.  Walking in clogs           -0.7647       -0.6098     -0.1882
27.  Car ignition               -1.3541       -0.2430      0.8155
28.  Wood chop                  -1.0564       -0.7822     -1.5734
29.  Power Saw                  -0.2338        1.4706      0.8750
30.  Door latched               -1.1821        0.1294      0.2596
31.  Cork popping               -0.4326        0.1654      0.1650
32.  File cabinet door          -0.5370        0.3032     -1.1127
33.  Door closing               -0.3843        0.1365     -0.4233
34.  Car backfire               -1.0389        0.7162     -0.0332
35.  Jail door closing          -1.0553        0.9311      0.6787
36.  Rifleshot outdoors         -0.4431        0.4987      0.8053
37.  Light switch               -0.7039       -0.1286      0.1627
38.  Stapler                    -0.5568        0.2221     -0.1979
39.  Telephone hangup           -1.3521       -0.4752     -0.6108
40.  Tree sawing                -1.5771       -0.0879      0.5163
41.  Electric lock              -1.4382        0.7439      0.8227


Figure  2.  Cluster  1,  consisting  of  the  eight  sounds  in  the  cluster  on  the 
right  in  Figure  1 ,  plotted  on  the  three  dimensions  used  for  the  cluster  solution. 


Figure  5.  Cluster  4,  consisting  of  the  fifteen  sounds  in  the  first  cluster  on 
the  left  in  Figure  1,  plotted  on  the  three  dimensions  used  for  the  cluster  solution. 

Figure  6.  Single  linkage  cluster  analysis  of  identification  confusion  index. 

Figure  7.  Complete  linkage  cluster  analysis  of  identification  confusion  index. 

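
The captions above refer to single linkage and complete linkage cluster analyses. As a rough illustration of how the two linkage rules differ, the sketch below applies both to five sounds taken from the table of three-dimensional coordinates. The choice of these five sounds, the use of Euclidean distance, and the helper names `linkage_distance` and `agglomerate` are assumptions made for illustration only. This is not the procedure used in the report, which for Figures 6 and 7 clustered an identification confusion index rather than these coordinates.

```python
from itertools import combinations
from math import dist

# Coordinates copied from the table of three-dimensional values
# (five sounds chosen arbitrarily for illustration).
SOUNDS = {
    "Water drip": (1.1189, -1.2313, 0.4162),
    "Oar rowing": (0.7036, -1.3898, -0.2945),
    "Water bubbling": (0.5305, -2.5203, -0.3634),
    "Door latched": (-1.1821, 0.1294, 0.2596),
    "Light switch": (-0.7039, -0.1286, 0.1627),
}


def linkage_distance(a, b, points, rule):
    """Distance between clusters a and b under the given linkage rule.

    Single linkage uses the closest pair of members,
    complete linkage uses the farthest pair.
    """
    pair_dists = [dist(points[x], points[y]) for x in a for y in b]
    return min(pair_dists) if rule == "single" else max(pair_dists)


def agglomerate(points, rule):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [frozenset([name]) for name in points]
    history = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(pair[0], pair[1], points, rule),
        )
        d = linkage_distance(a, b, points, rule)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        history.append((sorted(a), sorted(b), round(d, 4)))
    return history


for rule in ("single", "complete"):
    print(rule)
    for left, right, d in agglomerate(SOUNDS, rule):
        print("  merge", left, "+", right, "at distance", d)
```

With these five points both rules merge the clusters in the same order and differ only in the distances at which the later merges occur: complete linkage reports larger merge distances because it measures the farthest pair of members rather than the closest.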
Appendix  A 

Description  and  Source  of  41  Sounds 


Sound                         Description                                     Source (Record, Vol, Side, Band)

 1. Telephone ringing         high-pitched ringing                            SFX,5,1,6

 2. Clock                     series of clicking sounds, ticking at           SE,2,B,10
                              moderate speed

 3. Car horn                  blasting, honking sound of medium-pitched       SE,13,B,4
                              horn

 4. Doorbell                  two separate chimes that run together, both     CBS,3,1,16
                              chimes high-pitched, first chime has higher
                              pitch than second

 5. Automatic rifle           sporadic fire, 4-5 shots                        SE,13,B,13

 6. Riverboat whistle         strong, high-pitched blast                      SE,13,A,15

 7. Water dripping            high-pitched water drip                         Recorded

 8. Bell buoy                 two quick, high-pitched chimes, lapping water   AU,4,B,18
                              and seagulls in background

 9. Foghorn                   one blast of decreasing pitch                   SE,13,A,13

10. Water bubbling            continuous, gurgling sound                      AU,4,A,11

11. Bugle                     notes increasing in pitch                       AU,4,B,6

12. Rifle shot indoors        single shot, no echo                            SE,2,A,21

13. Lawn mower                loud, continuous, pulsating sound of a motor    SFX,1,1,16

14. Church bell tolling       echoing, high-pitched bell                      SE,2,A,8

15. Swish                     oar being rowed in water, sound of water        SFX,2
                              flowing smoothly

16. Knocking on door          hard knocking on door                           CBS,2,2,11

17. Flush                     toilet flushing, rushing water                  CBS,1,2,17

18. Footsteps                 woman walking quickly in high heels             SE,13,B,4

19. Fireworks                 powerful firecracker exploding, explosive,      SFX,8,2,11
                              thundering quality

20. Cigarette lighter         lighter being lighted, quick, grinding,         Recorded
                              high-pitched metallic sound, quick hissing

21. Touch tone telephone      beeping sounds produced by touch tone           SFX,5,1,10
                              telephone, beeps are at different pitches

22. Door opening              door being opened, metallic lock opening,       CBS,2,2,10
                              creaking of hinges in background

23. Bacon sizzling            sounds of bubbling, frying oil in a frying      AU,4,A,8
                              pan

24. Hammering                 series of pounding sounds, hammer pounding a    SFX,3,2,13
                              nail

25. Submarine dive horn       quick blast of increasing and then decreasing   SFX,1,2,21
                              pitch

26. Person walking in clogs   series of footsteps of person walking at a      SFX,3,1,25
                              leisurely pace in wooden clogs. Each step
                              contains two impact sounds of clogs hitting
                              a floor.

27. Ignition of car           increasing pitch of car ignition                SE,13,A,9

28. Chopping of tree          loud impact sound of sharp object cutting and   SFX,1,1,18
                              pounding into a tree

29. Power saw                 high pitched metallic grinding                  SFX,7,2,23

30. Door latched              two latching sounds, slightly muffled           SFX,1,2,5

31. Cork popping              loud popping sound                              SFX,5,1,13

32. File cabinet              sound of metallic wheels rolling on a           SFX,3,2,6
                              metallic track followed by the closing of
                              the drawer

33. Door closing              door being slammed into door frame, metallic    CBS,2,2,9
                              lock closing

34. Car backfire              one backfire, explosive quality, trace of       SE,13,A,9
                              sputtering before onset of backfire

35. Jail door closing         loud impact sound of a heavy metallic door      SFX,1,2,3
                              sliding shut with loud click of lock locking

36. Rifle shot outdoors       single shot, echo                               SE,2,A,19

37. Light switch              pull light switch with two clicks, metallic     Recorded
                              sound at end

38. Stapler                   stapler being pressed                           Recorded

39. Telephone being hung up   plastic phone receiver being dropped into its   SFX,5,1,8
                              cradle

40. Sawing of tree            moderate sawing speed, hand saw                 SFX,1,1,21

41. Electric lock             sequence of buzz and then clicking sound of     SFX,1,1,24
                              lock opening

References for sources of recordings


SE,2: Valentino, T.J. (Producer). Sound Effects Vol. II [Album]. New York, N.Y.:
Thomas J. Valentino Inc.

SE,13: Valentino, T.J. (Producer). Sound Effects Vol. XIII [Album]. New York, N.Y.:
Thomas J. Valentino Inc.

AU,4: Holzman, J. (Producer). Authentic Sound Effects Vol. IV [Album]. New York,
N.Y.: The Elektra Corporation.

CBS,1,2,3: Hoppe, E. and Dulberg, J. (Producers). The New CBS Audio-File Sound
Effects Library, Vol. II [Album] (1982). New York, N.Y.: CBS Records. (CBS,1
represents the first record within the volume, CBS,2 represents the second record,
and CBS,3 represents the third record).

SFX,1,2,3,5,7,8: White, V. (Producer). SFX Sound Effects [Albums]. New York, N.Y.:
Folkways Records and Service Corp.

Appendix  B 


Scales  Used  to  Solicit  Perceptual  and  Cognitive  Ratings 


1. Rate the identifiability of this sound.

   1          2          3          4          5
   not very                                    very
   identifiable                                identifiable

2. How easily does a mental picture of this sound come to mind?

   1          2          3          4          5
   not very                                    very
   easily                                      easily

3. How easily does the mental picture of the person or object
which caused this sound come to mind?

4. How easily does the mental picture of the action of this
sound come to mind?

   1          2          3          4          5
   not very                                    very
   easily                                      easily


5. How necessary is it to envision this sound in a sequence
of sounds in order to identify it?

   1          2          3          4          5
   not very                                    very
   necessary                                   necessary

6. To what extent is this sound a necessary part of the
sequence in the previous question?

   1          2          3          4          5
   not very                                    very
   necessary                                   necessary

7.  How  loud  do  you  think  this  sound  was? 


1 


5 

very  loud 


very  soft 

8. How many sounds can you think of which are similar to this one?

   1          2          3          4          5
   not very                                    very
   many                                        many

9. How many events can you think of which could have caused
this sound?

   1          2          3          4          5
   not very                                    very
   many                                        many

10. How easily are you able to think of words to describe
this sound?

   1          2          3          4          5
   not very                                    very
   easily                                      easily

11. How easily are you able to think of words which describe
the person or object which caused the sound?

   1          2          3          4          5
   not very                                    very
   easily                                      easily

12. How easily are you able to think of words which describe
the action which caused the sound?

   1          2          3          4          5
   not very                                    very
   easily                                      easily

13. How similar is this sound to your mental stereotype?

   1          2          3          4          5
   very                                        very
   similar                                     different

14. How familiar does this sound seem to you?

   1          2          3          4          5
   not very                                    very
   familiar                                    familiar

15. How clear was this sound in quality?

   1          2          3          4          5
   not very                                    very
   clear                                       clear

16. How many sounds can you think of that you would place in
the same category as this one?

   1          2          3          4          5
   not very                                    very
   many                                        many

17. Rate the following dimensions according to your feelings
about this sound.

   soft             1    2    3    4    5    hard
   round            1    2    3    4    5    angular
   dull             1    2    3    4    5    sharp
   relaxed          1    2    3    4    5    tense
   very pleasant    1    2    3    4    5    very unpleasant
   interesting      1    2    3    4    5    boring
   compact          1    2    3    4    5    scattered

18.  Sounds  generally  have  meanings  associated  with  them.  Based  upon  what  you 
think  the  sound  means  -  rate  the  nature  of  the  meaning  on  the  following  scale.  At 
one  end  are  sounds  which  literally  refer  to  only  the  events  which  caused  the  sound 
waves.  At  the  other  end  are  sounds  which  arbitrarily  symbolize  something 
unrelated  to  the  sound  waves.  In  the  middle  are  metaphorical  sounds  whose 
meanings  depend  in  part  on  the  physical  character  of  the  sound  but  which  also 
have  a  meaning  beyond  their  physical  acoustics.  What  is  the  nature  of  the  meaning 
of  the  sound? 

1.  symbolic 

2.  metaphorical 

3.  literal 